Add fixed-trajectory system tests with cross-track error metrics #365
Open
pvkumara wants to merge 62 commits into
New tests/test_fixed_trajectory.py evaluates drone performance on Circle, Figure8, Racetrack, and Line trajectories: takeoff -> execute -> land with cross-track error, path RMSE, execution time, and success metrics recorded to metrics.json for baseline comparison.
- Python ideal-path generators mirror fixed_trajectory_task.cpp equations
- Cross-track error uses robot pose snapshot at dispatch to transform base_link ideal path to world frame for odom comparison
- 5m loose tolerance documents the known circle failure without stranding drone
- conftest.py gains --trajectory-types CLI option and generalised phase-order sorting/ID-rewriting for both autonomy test modules
- tests/README.md documents the new module, all 11 metrics, and run commands
Made-with: Cursor
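The cross-track metric described above can be sketched as follows. This is a minimal illustration, assuming a planar `(x, y, yaw)` pose snapshot and a polyline ideal path; the function names are hypothetical and are not taken from tests/test_fixed_trajectory.py.

```python
import math


def to_world(path_base, pose):
    """Transform base_link-frame (x, y) points into the world frame.

    `pose` is the (x, y, yaw) snapshot of the robot taken at dispatch time.
    """
    px, py, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return [(px + c * x - s * y, py + s * x + c * y) for x, y in path_base]


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection so the closest point stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def cross_track_errors(odom_xy, path_world):
    """Distance from each odom sample to the nearest ideal-path segment."""
    segments = list(zip(path_world, path_world[1:]))
    return [min(point_segment_distance(p, a, b) for a, b in segments)
            for p in odom_xy]


def rmse(errors):
    """Root-mean-square of a non-empty list of errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical example: a 10 m Line trajectory dispatched from a robot
# sitting at (1, 2) and facing +y in the world frame.
ideal_world = to_world([(0.0, 0.0), (10.0, 0.0)], (1.0, 2.0, math.pi / 2))
errors = cross_track_errors([(1.5, 5.0), (0.5, 8.0)], ideal_world)
```

Snapshotting the pose once at dispatch keeps the ideal path fixed in the world frame, so later drift of the robot shows up as error rather than moving the reference.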
* Add link to PAT
* Change to new orchestrator instance workflow
* Add availability zone
* Bump version to 0.18.0-alpha.7
…ge build tests for ci/cd
…ding docker images
…so we can manually trigger pytest by commenting /pytest
…m tests to include build_packages
* added option for physics step frequency
* reverted example launch script
* patches PX4 simulation startup script and fixes robot DDS version
* set default physics Hz for PX4 to be 100Hz which is the minimum.
* reverted simulation changes
* updated docs
* Better error logging for ci/cd orchestrator
* Add check system resources before spawning server; if resources not available, report back and try again later
* added option for physics step frequency
* added option for physics step frequency
* removed physics frequency from .env and set working PX4 values in docker-compose defaults.
* removed unnecessary benchmarking from AirStack launch scripts.
---------
Co-authored-by: Andrew Jong <ajong@andrew.cmu.edu>
Update pull request template with versioning guidelines
Added guidelines for versioning in the pull request template.
Update pull request template for media uploads
Clarified instructions for adding videos and images in the PR template.
* Update PegasusSim lidar to new rtx lidar and optional min_sensor_range parameter to vdb model to avoid self-detection.
* removed deprecated ouster lidar. Completely integrated new rtx lidar
* renaming frame id back to ouster
* Added node to filter near and invalid lidar points
* reconciled topic names for lidar point cloud
* fixed example scripts to use rtx lidar api
* fixed tmux closing and rclpy path issue
* uses add_rtx in multi px4 script
* bumping version index
* docs added
* unit testing and documentation updates
* cleaning code from copilot suggestions
* docs(tests): fix pytest marker example for running liveliness and sensors
Agent-Logs-Url: https://github.com/castacks/AirStack/sessions/7ce7609a-a7f3-414d-9d42-0c9999d0459f
Co-authored-by: andrewjong <8121216+andrewjong@users.noreply.github.com>
* docs(tests): fix marker semantics in test_sensors module docstring
Agent-Logs-Url: https://github.com/castacks/AirStack/sessions/bdf00f6f-1d9f-4597-bf57-b96f99421646
Co-authored-by: andrewjong <8121216+andrewjong@users.noreply.github.com>
* addressing github copilot concerns
* docs(bridge): remove stale camera topics comment
Agent-Logs-Url: https://github.com/castacks/AirStack/sessions/2d5718ac-20e3-4f10-a12e-05d601cf000c
Co-authored-by: JohnYanxinLiu <63010779+JohnYanxinLiu@users.noreply.github.com>
* addressing copilot concerns
* removing debug print statement from reading point cloud
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix(isaac-sim): align drone1 lidar prim path with spawned prim
Agent-Logs-Url: https://github.com/castacks/AirStack/sessions/fbad2b9c-1761-45b1-b464-3e874511255c
Co-authored-by: JohnYanxinLiu <63010779+JohnYanxinLiu@users.noreply.github.com>
* more succinct comment in sim bashrc
* resolving discrepant comments in ros bridge yaml
* removed bug that allocated a new copy of the point cloud array
* logs lidar test with boolean instead of hz
* and --> or for marks
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: andrewjong <8121216+andrewjong@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Andrew Jong <ajong@andrew.cmu.edu>
* Fixed multi-drone global plan
* added sep files for fire and retro
* added robot2 relative pos; diff rviz files; bridge for rayfronts topics
* added sharing of semantic rays
* changed rviz for both drones
* added target sharing
* changed drone start pos
* gossip layer w/o relay
* added global coords under /{ROBOT_NAME}/interface/mavros/global_position/raw/fix (not my topic, it was already publishing to that)
* gossip, threedrone,peerprofile
* multi drone vis in foxglove, odom doesn't work in foxglove yet
* multi drone vis in foxglove works with odom
* global plan added
* added image, vdb markers(not transformed yet)
* fixed state estimation flickering and vdb transform
* added custom foxglove buttons for commands
* added modular payloads to peerprofile; foxglove reads the payloads and visualizes them, currently works for rayfronts
* fixing the rotation of payload
* syncing devices
* fixed gossip + translate
* added skill for foxglove/coordination
* removed VDB ENV
* rebase with main
* updated docs
* fixed launch files so they have play start on sim. scene_prep utils: added non-world prims to save in flattened manner
* created raven_nav package
* moved coordination to common
* fixed gcs<->robot dds
* added hitl functionality
* fixes to dds
* put dds hitl under gcs
* fixes to robot hitl
* syncing both computers
* mimicked robot-l4t for dataflow
* fixed path to ddsrouter_yaml
* fixed dds server
* fixed two_drone_fire
* rayfronts is now a ros package
* added feedback, it's sending success too early though
* fixed raven behavior
* foxglove panel with working executors
* random walk fixed
* fixed random walk bringup. Added saves and viz for multiple waypoints and polygons
* fixed bounds for exploration task, combined waypoint/polygon editor into task panel
* made waypoint/polygon gui larger
* added 2d map to foxglove
* WIP: pre-merge snapshot
* added changes from main
* WIP: pre-branch-split snapshot
* PR for foxglove+multi-robot
* merged with main
* PR cleanup: revert unrelated changes and drop extra files
- Restore main's robot.rviz (drop redundant robot_1/robot_2.rviz)
- Restore ms-airsim include in root docker-compose.yaml
- Restore airsim sections in docs/simulation/index.md
- Restore docs/gcs/docker/index.md (VERSION env name)
- Restore robot/docker/{.bashrc, Dockerfile.robot} to main
- Restore SIM_IP in robot/docker/docker-compose.yaml
- Restore takeoff_landing_planner takeoff_height: 8.0
- Drop docs/action_bridging.md (internal design memo)
- Drop personal launch scripts (two_drone_fire*, three_drone_scene_import, two_drone_RetroNeighbourhood)
- Trim verbose comments in gps_utils.py and example_multi_drone_scene_import.py
* Trim noisy inline comments in PR-added Python files
* Pin vdb_mapping_ros2 to public main (was at unpushed 68fe8dde)
* fixed launch script
* fixed foxglove bugs, added dynamic fg layout, updated docs
* fixed bugs found by copilot. Removed rviz by adding a node
* fixed comment
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fixed path in skill
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fixes from copilot
* Fix Pegasus submodule pointer after merge
Advance to 8e01d013 (main's pointer) which contains spawn_rtx_lidar.py,
required by example_one_px4_pegasus_launch_script.py and the multi
script after the rtx-lidar update merged from main.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(coordination): align gossip with steady clock + manifest hygiene
- gossip_node: swap startup log + outgoing-stamp clock to STEADY_TIME so
the dedup-by-stamp invariant survives /clock pauses; subscribe to
/global_position/global to match foxglove_visualizer and action_relay
- gossip_node docstring: drop the false "waypoint triggers immediate
publish" claim
- coordination README: rename peer_registry node block to the actual
per-robot registry topic; "wall-clock" -> "steady"
- package.xml: add missing exec/depend rules
- coordination_bringup -> autonomy_bringup
- autonomy_bringup -> coordination_bringup
- desktop_bringup -> coordination_bringup, gcs_visualizer
- gcs_visualizer -> std_msgs, coordination_msgs, coordination_bringup
- task_msgs: replace TODO license with BSD-3-Clause
- gcs.launch.xml: comment had `--no-sandbox` (`--` is illegal inside an
XML comment and crashed the ROS launch parser)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
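The gcs.launch.xml fix above follows from the XML rule that `--` may not appear inside a comment. A minimal sketch using Python's standard-library parser (not the ROS launch parser itself, and with made-up comment text) reproduces the failure:

```python
import xml.etree.ElementTree as ET

# Hypothetical launch snippets; only the comment text differs.
BAD = "<launch><!-- chromium needs --no-sandbox --></launch>"
GOOD = "<launch><!-- chromium needs the no-sandbox flag --></launch>"


def parses(xml_text):
    """Return True if the text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


bad_ok = parses(BAD)    # double hyphen inside the comment is rejected
good_ok = parses(GOOD)  # same comment without the double hyphen parses
```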
* fix(gcs+autonomy): drop dead BT panel, lint payload imports, name-map override
- payload_visualizer_node: remove unused PointCloud2 / transform_point_cloud2
imports (F401), collapse Marker/MarkerArray
- action_relay launch: ROBOT_RELAY_MAP env override for non-default
robot_name -> domain mappings (default behavior unchanged)
- desktop_bringup robot.rviz: drop BehaviorTreePanel entry pointing at
/behavior/behavior_tree_graphviz (publisher package was removed)
- autonomy_bringup domain_bridge: bridge /global_position/global to match
the dds_router and the rest of the stack (was /raw/fix)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(foxglove): clean panel-id stacking, atomic render, drop dead .foxe
- render_layout: regex now strips every trailing _r<n> (was: only the
last one), fixes _r1_r1_r1... stacking on repeated runs
- render_layout: atomic write via tmp + os.replace so a partial
json.dump doesn't corrupt the layout file
- airstack_default.json: re-render with fixed stripper to commit a
clean source template (no stacked _r1 suffixes)
- install.sh -> install.py: file is Python, shebang is python3
- install.py: slugify publisher into the on-disk extension dir name
so "AirLab CMU" doesn't produce a directory with a space
- drop robot-commands/robot-commands.foxe (duplicate; canonical is at
foxglove_extensions/robot-commands.foxe) and the .foxe.bak
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
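The two render_layout fixes above (stripping every trailing `_r<n>` and writing atomically via a temp file plus `os.replace`) can be sketched like this. The helper names and the example panel id are hypothetical; the actual render_layout code is not shown in this PR text.

```python
import json
import os
import re
import tempfile

# One or more "_r<digits>" groups anchored at the end of the id.
_RENDER_SUFFIX = re.compile(r"(?:_r\d+)+$")


def strip_render_suffix(panel_id):
    """Remove all stacked trailing _r<n> suffixes, not just the last one."""
    return _RENDER_SUFFIX.sub("", panel_id)


def atomic_write_json(path, data):
    """Write JSON to a temp file in the same directory, then swap it in.

    os.replace is atomic on the same filesystem, so readers see either the
    old file or the complete new one, never a partial json.dump.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f, indent=2)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if the dump or the swap failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Creating the temp file next to the target matters: `os.replace` across filesystems is not atomic and can fail outright.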
* bug fixes
* bug fixes
* version
* reverted env
* updated gitignore and docs
* updated foxglove viz + consistent spellings across repo
* Move layout file to /root/ so it's immediately accessible, also fix template path
* Change so that file name reflects NUM_ROBOTS
* Add a DEBUG_RVIZ flag to launch robot rviz if needed
---------
Co-authored-by: krrishj18 <krrishj18@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Andrew Jong <ajong@andrew.cmu.edu>
* fixes to scene_prep_utils.py
* edited docs
* clean launch script
* updated version
* fixed comments inconsistency and typos
* formatting fix
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* bug in gossip if payload is empty
* fixed omni_pass.env file creation bug from CICD guest default profile
* fixed depth topic naming in foxglove gcs
* changed gps topic
* removed redundant extensions
---------
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: airlab <johnliuchs2022@gmail.com>
…te develop from main
* feat(osmo): VS Code/Cursor dev workflow on NVIDIA OSMO
Adds a privileged Docker-in-Docker workspace task that lets a developer
run the full AirStack docker-compose stack on OSMO and attach an IDE
over SSH, with Isaac Sim WebRTC livestream + Foxglove websocket exposed
via osmo port-forward.
Components:
- osmo/workspace/{Dockerfile,entrypoint.sh,sshd_config}: airstack-osmo-workspace
image. Ubuntu 24.04 + sshd (pubkey-only) + Docker CE + Docker Compose +
nvidia-container-toolkit + fuse-overlayfs (DinD-on-overlayfs needs it,
otherwise dockerd falls back to vfs which bloats AirStack images ~10x).
- osmo/workflows/airstack-dev.yaml: single privileged GPU task. Materializes
Nucleus + airlab-docker secrets from OSMO credentials, clones AirStack,
starts inner dockerd, runs `airstack up` with desktop + isaac-sim-livestream
Compose profiles.
- simulation/isaac-sim: isaac-sim-livestream Compose service that runs
Pegasus standalone with --/app/livestream/enabled=true and exposes
WebRTC port ranges 47995-48012 / 49000-49007 / 49100; launch script
gates headless+livestream extension on ISAAC_SIM_LIVESTREAM env var.
- .airstack/modules/osmo.sh: airstack osmo:{up,ide,foxglove,webrtc,logs,down}
CLI wrappers around `osmo workflow submit` / `port-forward` / `cancel`.
Persists the active workflow id and validates it's still running before
each command (prevents the stale-state 410 error).
- airstack.sh: bash 4+ re-exec bootstrap (macOS ships 3.2; the CLI uses
`declare -A`).
- osmo/README.md + docs/tutorials/airstack_on_osmo.md: admin pool setup
(privileged_allowed) + per-user credentials (airlab-docker-login,
airlab-nucleus) + student-facing IDE attach + WebRTC/Foxglove flow.
Pool requirements: privileged_allowed: true, GPU pool with
nvidia-container-toolkit on the host, ample node ephemeral storage
(AirStack images extracted are ~50-100Gi via fuse-overlayfs; vfs needs
~500Gi+).
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(osmo): harden CLI + workspace image against stale-state, port-forward race, and cursor-server install hangs
Four bugs that bit the first end-to-end runs (airstack-dev-10 → -13):
- _osmo_wf_id: validate saved workflow id against `osmo workflow query`
before returning. Without this, the state file at ~/.airstack/osmo-state
outlives the workflow it points at and every subsequent osmo:webrtc /
osmo:foxglove / osmo:ide call surfaces the same confusing
"Workflow airstack-dev-N is not running! (status 410)" instead of the
obvious "run airstack osmo:up to launch a fresh workflow".
- cmd_osmo_up: `osmo workflow submit --set-env` is variadic. Passing two
separate `--set-env A=1 --set-env B=2` silently drops the first one —
this is what made airstack-dev-11 fail with "ERROR: SSH_PUB_KEY not set"
when --branch was passed alongside the pubkey. Collapse the K=V pairs
into a single --set-env.
- cmd_osmo_ide: previously launched the IDE before starting the
port-forward, so Cursor/VS Code would try to SSH localhost:2200 a few
hundred ms before the tunnel listener existed and fail with
"connect to host localhost port 2200: Connection refused". Now: detect
an existing forward and reuse it (also avoids the "Address already in
use" if osmo:foxglove was started in parallel), otherwise spawn the
forward in the background, wait up to 30s for it to bind, then launch
the IDE. Ctrl+C tears down the spawned forward cleanly via a trap.
- workspace image / entrypoint: Cursor Remote-SSH hung indefinitely
on airstack-dev-13 because (a) cursor-server's installer fell back to
wget when curl timed out and wget was not in the image, and (b) a
/tmp/cursor-remote-lock.* file left behind by the first crashed
install blocked every silent retry. Add wget to the apt install list
and rm -f the stale Cursor / VS Code remote lock files at the very
top of entrypoint.sh so each fresh pod starts from a clean slate.
Co-authored-by: Cursor <cursoragent@cursor.com>
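The `--set-env` pitfall in the cmd_osmo_up item above is easy to reproduce with any variadic option. The sketch below uses Python's `argparse` purely as an illustration; it is an assumption that the OSMO CLI behaves the same way internally, though the symptom described in the commit matches.

```python
import argparse

# A variadic option: one flag occurrence consumes one or more values.
parser = argparse.ArgumentParser()
parser.add_argument("--set-env", nargs="+", default=[])

# Repeating the flag: the second occurrence overwrites the first.
split = parser.parse_args(["--set-env", "A=1", "--set-env", "B=2"])

# Collapsing the K=V pairs into one occurrence keeps both.
joined = parser.parse_args(["--set-env", "A=1", "B=2"])
```

With the default `store` action, `split.set_env` holds only `B=2`, while `joined.set_env` holds both pairs.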
* fix(osmo): correct osmo:logs CLI invocation; install Foxglove extensions locally on osmo:foxglove
osmo:logs was invoking `osmo workflow logs <id> workspace --follow`, but
the real CLI takes the task via `-t TASK` (not positionally) and has no
`--follow` flag at all — so the command failed immediately with
"unrecognized arguments: workspace --follow". Replace with a polling loop
that uses `-t workspace -n <N>` on a short interval, prints only the
suffix that appeared since the previous fetch (find-the-last-seen-line
trick; degrades to "reprint tail" with a warning if the cursor outruns
-n), and exits cleanly once the workflow reaches a terminal state.
Tunables: OSMO_LOGS_TASK / OSMO_LOGS_TAIL / OSMO_LOGS_INTERVAL.
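The find-the-last-seen-line trick described above can be sketched as a pure function over two successive tail fetches. This is a hypothetical Python rendering of the idea, not the shell code in .airstack/modules/osmo.sh.

```python
def new_lines(previous_tail, current_tail):
    """Return (lines_to_print, overran) for two successive tail fetches.

    Finds the last line printed previously inside the new tail and returns
    everything after it. If that line is no longer present, the log grew by
    more than the tail window, so the whole tail is returned and `overran`
    is True so the caller can warn.
    """
    if not previous_tail:
        return list(current_tail), False
    last_seen = previous_tail[-1]
    # Search from the end so the most recent match wins.
    for i in range(len(current_tail) - 1, -1, -1):
        if current_tail[i] == last_seen:
            return current_tail[i + 1:], False
    return list(current_tail), True
```

One limitation of this sketch: if the log repeats an identical line, the match can land on the wrong occurrence and skip output.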
osmo:foxglove now installs the AirStack Foxglove extensions
(robot-commands / waypoint-editor / polygon-editor) into the laptop's
local Foxglove user-extensions directory before opening the
port-forward. Without this, custom panels show up as "Unknown panel
type: robot-commands.Robot Tasks" in the laptop's Foxglove Desktop
because it has no way to discover the extension folders that live
inside the GCS container. To avoid duplicating the install logic, the
existing gcs/foxglove_extensions/install.py is refactored to read
FOXGLOVE_EXT_SRC / FOXGLOVE_EXT_DST env vars (the in-container call
already in gcs/docker/gcs-base-docker-compose.yaml keeps working
unchanged via defaults). The wrapper sets those vars to
${PROJECT_ROOT}/gcs/foxglove_extensions and
~/.foxglove-studio/extensions respectively, overridable with
OSMO_FOXGLOVE_EXT_DIR / skippable with OSMO_FOXGLOVE_SKIP_EXTENSIONS=1.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(osmo): pin Kit livestream UDP media port to 49099 so osmo:webrtc actually shows pixels
Kit 107's WebRTC livestream picks a UDP media port dynamically. The
documented `omni.services.livestream.nvcf` defaults (minHostPort=47998
maxHostPort=48020 fixedHostPort=0) are ignored by the stock standalone
Kit binary — on airstack-dev-13 it bound to UDP 49042, outside both the
Compose-published range AND the default `osmo:webrtc --udp` forward of
`47995-48012,49000-49007`. Result: TCP signaling on 49100 worked, the
WebRTC Streaming Client window opened, but every SRTP media packet was
dropped → black viewport plus the recurring
`NVST_CCE_DISCONNECTED when m_connectionCount 0 != 1` underflow in Kit's log.
Pin the media port via three `app.livestream.*` settings set on
`SimulationApp` before `omni.kit.livestream.webrtc` is enabled, so
whichever code path the carb.livestream-rtc.plugin consults lands on the
same port:
app.livestream.fixedHostPort = 49099
app.livestream.minHostPort = 49099
app.livestream.maxHostPort = 49099
49099 is a deliberate one-off from the 49100 TCP signaling port — same
neighborhood, easy to remember. Verified live on airstack-dev-13 after
`docker compose up -d --force-recreate isaac-sim-livestream`: Kit binds
UDP 49099 (`/proc/net/udp` hex BFCB on 0.0.0.0) and docker-proxy
publishes it from the pod host network.
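The `/proc/net/udp` check mentioned above relies on the kernel printing local ports in hexadecimal. A small sketch of the decoding, assuming the usual whitespace-separated row layout where the second field is `address:port` (the sample row is illustrative):

```python
def udp_local_port(proc_net_udp_row):
    """Extract the local port from one data row of /proc/net/udp."""
    fields = proc_net_udp_row.split()
    local_address = fields[1]          # e.g. "00000000:BFCB"
    _, port_hex = local_address.split(":")
    return int(port_hex, 16)


# Illustrative row for a socket bound to 0.0.0.0 on the pinned media port.
sample_row = "  42: 00000000:BFCB 00000000:0000 07 00000000:00000000"
port = udp_local_port(sample_row)
port_as_hex = format(49099, "04X")
```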
Knock-on cleanups:
- `simulation/isaac-sim/docker/docker-compose.yaml` shrinks the
isaac-sim-livestream `ports:` from 27 forwarded ports
(`47995-48012, 49000-49007 TCP+UDP, 49100 TCP`) to just two:
`49100/tcp` + `49099/udp`.
- `.airstack/modules/osmo.sh` shrinks `OSMO_WEBRTC_TCP` to `49100` and
`OSMO_WEBRTC_UDP` to `49099`, so `airstack osmo:webrtc` spawns two
port-forwards instead of thirty.
- `.gitignore` ignores `.DS_Store` so working from a Mac doesn't leak
Finder metadata.
After pulling this commit into a running pod: `docker compose up -d
--force-recreate isaac-sim-livestream` to apply the new port mapping;
then re-run `airstack osmo:webrtc` on the laptop to pick up the new
forward ranges. The standalone WebRTC Streaming Client connects to
`localhost` (same address as before) and now actually receives frames.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(osmo): render Kit GUI in WebRTC stream; document SSH agent forward for in-pod git push
Two paper-cuts that bit airstack-dev-13 after the WebRTC media port pin
landed (commit 2d9b161):
(1) The WebRTC stream showed only the bare 3D viewport — no menu bar,
no toolbar, no panels, no console. Cause: SimulationApp's default
when `headless=True` is to also hide the UI (`hide_ui=True`). The
NVIDIA reference at
`simulation/isaac-sim/standalone_examples/api/isaacsim.simulation_app/livestream.py`
explicitly opts back into UI rendering plus picks explicit window
sizing and `display_options=3286` to keep the default grid/axes
visible. Mirror that config in `example_one_px4_pegasus_launch_script.py`
when `ISAAC_SIM_LIVESTREAM=true` (local desktop dev keeps the
minimal `headless=False` path unchanged).
(2) The pod has no SSH private key, only an `authorized_keys` for
inbound connections from the user's laptop. As a result, `git push`
from inside the Cursor / VS Code Remote-SSH session inside the pod
fails with "Permission denied (publickey)". sshd inside the
workspace image already has `AllowAgentForwarding yes` baked in via
`osmo/workspace/sshd_config`; the missing piece is purely on the
Mac side. Update the `~/.ssh/config` block in the tutorial to
include `ForwardAgent yes` (so the local agent's keys are exposed
in the pod), `AddKeysToAgent yes` (auto-load on first push), and
`UseKeychain yes` (macOS-only Keychain unlock without passphrase
prompts; ignored on Linux). Adds an `ssh-add -l` smoke-test note.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(osmo): make osmo:setup idempotent + paste-safe; document Nucleus auth-debug path
osmo:setup hit two failure modes that wasted a debug session each:
- `osmo credential set` is not an upsert for GENERIC creds — re-running
setup (e.g. to rotate a Nucleus API token) failed with `400 duplicate
key value violates unique constraint "credential_pkey"` and bailed
before reaching the airlab-nucleus credential. Delete-then-set each
credential so re-running is idempotent.
- Bracket-paste mode and cross-OS clipboards routinely smuggle invisible
bytes around long pastes. Nucleus's auth endpoint silently DENIES a
token with one extra trailing byte, with no actionable error from the
client side. _osmo_prompt now strips leading/trailing whitespace and
CR/NUL bytes via a new _osmo_trim helper, and warns when bytes were
stripped. cmd_osmo_setup additionally JWT-shape-checks the Nucleus
token (must be eyJ.<dot>.<dot>.) before submitting it, so a wrong
paste fails at setup time instead of silently DENIED at pod boot.
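The trimming and shape check described above can be sketched as below. This is a hypothetical Python version of the idea behind `_osmo_trim` and the JWT check, assuming the token is three dot-separated base64url segments whose header starts with `eyJ`; it removes CR and NUL bytes anywhere in the value, which may be stricter than the shell helper.

```python
import re

# Three base64url segments; the header segment starts with "eyJ".
_JWT_SHAPE = re.compile(r"eyJ[A-Za-z0-9_-]*\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")


def trim_pasted(value):
    """Strip CR/NUL bytes and surrounding whitespace from pasted input.

    Returns (cleaned_value, was_changed) so the caller can warn when
    invisible bytes were removed.
    """
    cleaned = value.replace("\x00", "").replace("\r", "").strip()
    return cleaned, cleaned != value


def looks_like_jwt(token):
    """Cheap structural check; it does not verify the signature or expiry."""
    return _JWT_SHAPE.fullmatch(token) is not None
```

Checking the shape at setup time turns a silent DENIED at pod boot into an immediate, local error.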
Also documents how to debug the "Login Required: Unable to connect
server omniverse://airlab-nucleus..." popup: SSH the Nucleus host and
tail base_stack-nucleus-auth-1 for InternalCredentials.auth status:
DENIED. Adds a "Nucleus connectivity from OSMO" section to the admin
README clarifying that Nucleus over HTTPS uses a single 443 (no need
to open the native 3009-3180 range from the OSMO cluster), per
NVIDIA's TLS docs.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(osmo): use Nucleus API-token auth, with double-dollar to survive compose parser
The OSMO entrypoint was writing OMNI_USER=<andrew_id> alongside an API
token JWT in OMNI_PASS, which routes the JWT through the password-
verification path. Nucleus silently DENIES — visible only in
base_stack-nucleus-auth-1 as `InternalCredentials.auth … 'username':
'<andrew>' … status: DENIED` (no Tokens.auth_with_api_token call). Kit
then pops "Login Required: Unable to connect server omniverse://...".
omniclient expects the literal sentinel username `$omni-api-token` paired
with the JWT as the password. The entrypoint now detects a JWT-shaped
OMNI_PASS (header starts with `eyJ`) and emits OMNI_USER=$$omni-api-token
into omni_pass.env. The `$$` is intentional: docker-compose v2
interpolates env_file values, and a single `$` would be eaten by the
parser (`OMNI_USER=$omni-api-token` becomes `OMNI_USER=-api-token` after
${omni}- expansion to empty). The container ultimately sees
OMNI_USER=$omni-api-token, which is the correct sentinel.
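The effect of the doubled dollar sign can be illustrated with Python's `string.Template`, which happens to share the `$$` escape convention. This is only an analogy, not Compose's actual parser, and it assumes the variable `omni` expands to an empty string as described above.

```python
from string import Template

# Single "$": "$omni" is treated as a variable and expands to "".
single = Template("OMNI_USER=$omni-api-token").safe_substitute(omni="")

# Double "$$": the escape collapses to one literal "$".
double = Template("OMNI_USER=$$omni-api-token").safe_substitute(omni="")
```

Here `single` ends up as `OMNI_USER=-api-token`, while `double` yields the intended sentinel `OMNI_USER=$omni-api-token`.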
Also note for the next debugger: `docker compose restart` does NOT
re-read env_file. Use `docker compose up -d <svc>` to recreate the
container after editing omni_pass.env.
Updates omni_pass_TEMPLATE.env header to document the API-token pattern
explicitly (with the $$ caveat), and adds a troubleshooting row that
distinguishes "wrong auth path" (DENIED with no Tokens.auth_with_api_token
call) from "bad/expired token" (Tokens.auth_with_api_token: DENIED).
Co-authored-by: Cursor <cursoragent@cursor.com>
* docs(osmo): make OSMO the recommended dev path, single clone-the-repo flow
Reposition the OSMO tutorial as AirStack's recommended day-to-day
development path (not just a fallback for laptops without GPUs) and
collapse it onto a single recipe: clone the repo, then drive everything
through the airstack osmo:* wrappers in .airstack/modules/osmo.sh.
- docs/tutorials/airstack_on_osmo.md
- Retitle + rewrite the intro to lead with five concrete advantages
(pooled GPUs, no local CUDA/Docker/driver maintenance, same image as
CI + field robots, one-command onboarding, hardware bigger than your
laptop). Demote the Linux+GPU-desktop path to an escape hatch.
- Drop the Mac/Windows/no-GPU framing in 'Who is this for?' and the
mermaid laptop subgraph label.
- Add 'a local clone of AirStack' to Prerequisites; remove it from the
'do not need' list.
- Replace Option A/B credential split with a single
./airstack.sh osmo:setup recipe; move the three raw osmo credential
set calls into a collapsible 'Under the hood' footnote.
- Replace each step's raw osmo workflow ... command with the
corresponding airstack osmo:up/logs/ide/webrtc/foxglove/down wrapper;
preserve the raw form in 'Under the hood' footnotes that cross-link
cmd_osmo_* in .airstack/modules/osmo.sh.
- Drop the export WF=... paragraph — the wrappers read the id from
~/.airstack/osmo-state automatically; AIRSTACK_OSMO_WF overrides
per-invocation. \$WF now only appears inside the raw-form footnotes.
- Sweep Troubleshooting + What-survives tables: redirect raw
port-forward fixes to the airstack osmo:* equivalents and rename the
section to 'What survives airstack osmo:down?'.
- Fix WebRTC edge label (49100/tcp + 49099/udp) to match the pinned
ports the workflow actually uses today.
Companion cleanups now that the privileged_allowed flip is automatic on
the OSMO autosync side (synchronize_osmo_team_pools.py forces
privileged_allowed: true on every platform of every pool, so students
never see the 'platform does not have privileged flag enabled' error):
- osmo/README.md: drop the 'Most common blocker' privileged warning, the
privileged_allowed row from the pool-requirements table, and the
'privileged GPU pod' / '(privileged, GPU)' descriptors in the
architecture summary. Simplify the validation-stage SSH-failure hint.
- osmo/workflows/airstack-dev.yaml: trim the long DinD-requires-privileged
comment to a one-liner (the privileged: true directive itself stays).
- .airstack/modules/osmo.sh: remove the special-case 'privileged flag
enabled' error branch in cmd_osmo_up — it should never fire now.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(osmo): make osmo:logs actually stream + survive pod host-key churn
osmo:logs was silent because cmd_osmo_logs wrapped osmo workflow logs in
$( ... ) on the assumption that -n LAST_N_LINES exits after dumping the
tail. Empirically the CLI keeps the stream open as new lines arrive (it
already behaves like tail -f, despite --help advertising only -n), so
command substitution waited forever and printed nothing. Drop the polling
loop and just exec the command directly.
Each fresh OSMO pod also ships a new sshd host key, so every osmo:up
trips StrictHostKeyChecking against the previous workflow's fingerprint
and SSH/Cursor abort with "Host key for [localhost]:2200 has changed".
Switch the recommended ~/.ssh/config block (and osmo/README.md) to the
ephemeral-host pattern (StrictHostKeyChecking no + UserKnownHostsFile
/dev/null + LogLevel ERROR), and have cmd_osmo_ide ssh-keygen -R the
stale loopback entry on every run so users on the old config get
unblocked automatically.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(osmo): auto-pin --branch to local checkout + clean error UX when workflow dies
The pod's entrypoint clones AirStack fresh from GitHub on every workflow
start (the pod fs is ephemeral). It defaulted to `main`, so any developer
testing branch-only OSMO changes silently ran their pod against stale
`main` code — most visibly: COMPOSE_PROFILES=desktop,isaac-sim-livestream
resolved to "desktop" alone on `main` because the isaac-sim-livestream
service only exists on the feature branch, so isaac-sim never came up
and `airstack osmo:webrtc` showed a blank stream.
- cmd_osmo_up now defaults --branch to the local repo's current
branch (git rev-parse --abbrev-ref HEAD). Detached HEAD or
non-git checkouts fall back to `main` cleanly. Pass --branch
explicitly to override.
- New _osmo_check_branch_pushed warns up-front when the about-to-
submit branch has no upstream, is ahead of origin, or has an
uncommitted working tree. The pod doesn't see your laptop's edits.
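The `--branch` default described above can be sketched as follows. This is a hypothetical Python rendering (the real logic lives in the shell module), assuming `git rev-parse --abbrev-ref HEAD` prints the literal string `HEAD` when the checkout is detached.

```python
import subprocess


def current_branch(repo_dir="."):
    """Return the checked-out branch name, or None if git cannot tell."""
    try:
        result = subprocess.run(
            ["git", "-C", repo_dir, "rev-parse", "--abbrev-ref", "HEAD"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git missing, or repo_dir is not a git checkout.
        return None
    return result.stdout.strip()


def default_branch(detected, fallback="main"):
    """Pick the branch to submit: the local one, else the fallback.

    A detached HEAD reports "HEAD", which is not a branch the pod can clone.
    """
    if not detected or detected == "HEAD":
        return fallback
    return detected
```

A caller would combine the two as `default_branch(current_branch())` and still let an explicit `--branch` override the result.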
Separately, when an OSMO workflow gets canceled mid-flight (osmo:down
in another shell, or OSMO timing it out), the in-flight port-forward
and logs streams raise OSMOUserError("Workflow X is not running!")
from inside an asyncio Task. The CLI prints "Task exception was never
retrieved" + a multi-line Traceback that buries the actual one-line
cause. New _osmo_pf_filter awk script collapses that into a single
[ERROR] line pointing at `airstack osmo:up`. Wired into webrtc,
foxglove, and logs. webrtc also gains a cleanup trap that kills the
backgrounded UDP port-forward on EXIT/INT/TERM so we don't leak it
against a dead workflow.
Tutorial Step 2 documents the new --branch default and the
"pod-clones-from-GitHub-not-your-laptop" gotcha.
Co-authored-by: Cursor <cursoragent@cursor.com>
* perf(osmo): bump inner dockerd concurrency to saturate 10 GbE pulls
dockerd's defaults of --max-concurrent-downloads=3 / --max-concurrent
-uploads=5 cap a fresh airstack-dev pod's image-pull at ~300 MiB/s
against the airlab-backup-10g registry — single-stream TLS tops out
around 300-500 MiB/s per core, and three parallel streams of unevenly
sized blobs serialize down to that ceiling. Ceph (1014 TiB, 92 OSDs,
SSD pools) and 10 GbE both have far more headroom than that. Bump to
10/10 to overlap enough blob downloads to saturate the pipe.
Threaded through the DOCKERD_MAX_DOWNLOADS / DOCKERD_MAX_UPLOADS env
vars so a pool can be tuned at submit time without rebuilding the
workspace image.
Workspace image needs a rebuild + push for this to take effect:
cd osmo/workspace
docker build -t airlab-docker.andrew.cmu.edu/airstack/airstack-osmo-workspace:latest .
docker push airlab-docker.andrew.cmu.edu/airstack/airstack-osmo-workspace:latest
Co-authored-by: Cursor <cursoragent@cursor.com>
* docs(osmo): require buildx --platform linux/amd64 for workspace image
A plain `docker build && docker push` on an Apple Silicon Mac silently
produces a linux/arm64-only `latest` manifest. OSMO workers are amd64,
so every subsequent workflow fails at the outer pod-image pull with
"no match for platform in manifest" before the entrypoint even runs —
a confusing failure mode whose root cause lives entirely in the push,
not in the workflow yaml or the entrypoint.
Switch the README and the Dockerfile docstring to the buildx form,
explain the why, and document the post-push manifest check.
Co-authored-by: Cursor <cursoragent@cursor.com>
* perf(osmo): move dockerd data-root to /osmo/run for native overlay2
The OSMO pod's `/` is itself a containerd overlay snapshot, and Linux
refuses to stack a second overlayfs on top of an overlay rootfs — which
is why the inner dockerd was falling through to fuse-overlayfs. That
costs a kernel↔userspace FUSE round-trip on every `creat()` during
layer extraction, which murders throughput on apt/pip/ROS layers
(measured: 32-50 MB/s for small-file-heavy layers vs 480 MB/s for
big-file layers in the same pull).
Pointing dockerd at /osmo/run/docker (the kubelet emptyDir backed by
ext4 on /dev/vda3) lets the existing overlay2-first fallback chain
actually succeed on its first try, restoring kernel-overlay extraction
performance. emptyDir lifetime matches the workflow lifetime, so the
docker layer cache gets the right scope automatically.
Falls back to /var/lib/docker if /osmo/run isn't present so the image
still works in non-OSMO test contexts.
Co-authored-by: Cursor <cursoragent@cursor.com>
* updated version
* added virtual display for GL context
* added virtual display for droan_gl
* droan_gl patch
* run Xvfb in its own tmux session
* updated dockerfile + version
* typo in docs
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* typo in comments
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* typo in comments
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* typo in comments
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* typo in osmo logs, renamed airstack-isaac-sim to just isaac-sim
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* typo in container name for isaac-sim-livestream
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* airstack-dev version overwrite removed
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: krrishj18 <krrishj@andrew.cmu.edu>
Co-authored-by: Andrew Jong <ajong@andrew.cmu.edu>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…easily see their results in one file without having to wade through a ton of log files to get what they need
…t doesn't inundate the user with a ton of log files for no reason
- What features did you add and/or bugs did you address?

- Which GitHub issue does this address?
This PR does not address an existing GitHub issue. Instead, it adds a new fixed-trajectory test suite that automatically measures path-tracking error in simulation.
- Additional description if not fully described in the GitHub issue
This PR adds automated fixed-trajectory evaluation tests for the autonomy stack and fixes trajectory-tracking bugs that caused path execution to fail. It also improves the test results workflow so maintainers get one readable summary file instead of many per-test logs.
https://youtu.be/zaaZqLUzqZ8
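The cross-track error and path RMSE metrics this PR records can be illustrated with a minimal sketch. It is not the code in tests/test_fixed_trajectory.py: the function names are hypothetical, the pose snapshot is assumed to be a planar (x, y, yaw) tuple, and the error is assumed to be the shortest 2D distance from each odometry sample to the ideal polyline.

```python
import math


def base_link_to_world(path_xy, pose):
    """Transform base_link-frame points into the world frame.

    pose is an assumed planar snapshot (x, y, yaw) taken at dispatch time.
    """
    x0, y0, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return [(x0 + c * px - s * py, y0 + s * px + c * py) for px, py in path_xy]


def point_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def cross_track_errors(odom_xy, ideal_world_xy):
    """Per-sample distance from each odometry point to the ideal polyline."""
    segments = list(zip(ideal_world_xy, ideal_world_xy[1:]))
    return [min(point_to_segment(p, a, b) for a, b in segments) for p in odom_xy]


def rmse(errors):
    """Root-mean-square of a non-empty list of errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical usage: a 10 m line defined in base_link, dispatched while the
# robot sits at (1, 2) facing +y in the world frame.
ideal_world = base_link_to_world([(0.0, 0.0), (10.0, 0.0)], (1.0, 2.0, math.pi / 2))
errors = cross_track_errors([(1.5, 5.0), (0.0, 8.0)], ideal_world)
print(max(errors), rmse(errors))
```

Snapshotting the pose once at dispatch keeps the ideal path fixed in the world frame, so later drift in odometry shows up as error rather than moving the reference.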
How did you implement it?
How do you run and use it?
To run these tests, run `airstack up` and then compose a test run for your needs using the global CLI options. The basic tests I ran to validate this testing stack are included below, along with the global CLI options.
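As a rough illustration of how a `--trajectory-types` option could be registered and validated in conftest.py, here is a minimal sketch. The comma-separated format, the case-insensitive matching, the empty-means-all default, and the `parse_trajectory_types` helper are all assumptions made for this example, not the PR's actual behavior. Only `parser.addoption` is the standard pytest hook API.

```python
# Trajectory names taken from the PR description.
KNOWN_TRAJECTORIES = ("Circle", "Figure8", "Racetrack", "Line")


def parse_trajectory_types(raw):
    """Parse an assumed comma-separated option value into canonical names.

    An empty value selects every known trajectory (an assumption).
    """
    if not raw:
        return list(KNOWN_TRAJECTORIES)
    lookup = {name.lower(): name for name in KNOWN_TRAJECTORIES}
    selected = []
    for token in raw.split(","):
        key = token.strip().lower()
        if not key:
            continue  # tolerate stray commas
        if key not in lookup:
            raise ValueError(f"unknown trajectory type: {token.strip()!r}")
        if lookup[key] not in selected:
            selected.append(lookup[key])
    return selected


def pytest_addoption(parser):
    """Register the option with pytest (hook name and call are standard)."""
    parser.addoption(
        "--trajectory-types",
        action="store",
        default="",
        help="Comma-separated trajectory types to run (hypothetical format).",
    )
```

Rejecting unknown names up front means a typo on the command line fails immediately instead of silently selecting no tests.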
Testing with PyTest
airstack test -m ...
After running an airstack test command, a maintainer should see in the console that all the tests have passed. They can then go to the testing folder, find the folder for their test, and open its summary.txt file to see all the metrics the test produced.

Documentation
Yes, mkdocs.yml was updated to add a trajectory testing page.
Yes, there is now a docs page that explains the full pipeline for trajectory testing.
I believe the YouTube video above provides sufficient visual media, but I can generate more if needed.
Versioning
.env file according to semantic versioning?
Yes, the version was changed to 0.19.0-alpha.4.